1.0 Introduction


Metal-oxide semiconductor field-effect transistors (MOSFETs) are used extensively in flight hardware
and ground support equipment. In the quest for faster switching times and lower “on resistance,” the
MOSFETs designed from 1998 to the present have achieved most of their intended goals. Unfortunately,
along with the good (higher power efficiencies and lower total mass in DC-DC converters, and
high-speed switches) has come the bad. The designs now being produced allow the charge-carrier
dominated region (once small and outside of the area of concern) to become important and to fall
inside the safe operating area (SOA). In the charge-carrier dominated region, more current flows as
the temperature increases. The higher temperatures produce more current, resulting in the beginning
of thermal runaway. While the problem may start with the entire part, as the runaway progresses a hot
spot forms and then shrinks in size. With more power in a smaller area, the temperature rises even
higher and faster, resulting in the failure of the part. Temperatures above 450°C at any location
within the part will cause the metals to begin migrating, causing a fatal short.

Earlier MOSFETs were primarily run in the mobility-charge dominated region. At a fixed gate voltage,
the current in the mobility-charge dominated region decreases as the temperature increases, giving the
system negative feedback that steers it away from thermal runaway. Indeed, when the new power MOSFETs
have high gate voltages, the parts are mobility-charge dominated. It has been the unspoken intent of
the manufacturers to keep the MOSFETs in the mobility-charge dominated region, as they are when used
as a high-speed switch. The older parts also have a charge-carrier dominated area; however, that area
is outside the normal SOA, and failures occur there for other reasons.


2.0 Background 

During a recent board-level test of a radiation protection circuit in a power supply being built for the 
James Webb Space Telescope (JWST), a MOSFET quickly failed. The protection circuit should have 
sheltered the MOSFET instead of causing it to fail. Since the test was unusual, it was originally believed 
that the test itself induced the failure. A replacement MOSFET was installed and the test was rerun. The 
second MOSFET failed during diagnostic electrical probing. Review of the parts revealed that both had 
failed from thermal stress caused by an apparent thermal runaway. Temperature internal to the parts had 
to have exceeded 450°C as indicated by melted internal aluminum spheres (Figure 1) and discoloration of 
the parts’ die in localized areas (Figure 2). 




Figure 1. Aluminum sphere formed from overheating.

Figure 2. Failure area of first MOSFET.

The “bulls-eye” pattern in the photographs led to the suggestion that the failure mode for the two
MOSFETs was common and was caused by the MOSFETs being placed in a thermal runaway condition
when the gate voltage was low yet well within the SOA for the MOSFETs. This problem, known as
“thermal instability,” has been known to the automotive industry since 1997 (when advanced, very
fast switching MOSFET devices became available). Numerous published articles in MOSFET
engineering literature acknowledge the problem, but it is not recorded in application notes or in the
parts’ data sheets from the manufacturers. When questioned, the manufacturer, International Rectifier
(IR), responded that they believed their parts were only being used in a switching mode operation
(high gate voltage) and not in areas where the gate voltage was low.

2.1 MOSFET Failures Inside the Advertised SOA 

A consultant was asked to review the problem, which was believed to involve thermal runaway. The same
problem had been observed at the Jet Propulsion Laboratory (JPL) in 2003 and was labeled “Thermal
Instability inside the advertised SOA.” In the 2003 case, JPL had built a “Protection Circuit” that
destroyed the MOSFET every time the circuit was tested. JPL looked into this destruction, talked to the
manufacturer, and discovered the auto industry had found the problem in 1997. JPL then reverted to
“older parts,” and trusted the manufacturer to advertise the problem; however, this never occurred.

The automobile industry found MOSFETs would short when used in protection circuits and variable 
speed fan controllers. Manufacturers, the automobile industry, and the Institute of Electrical & Electronics 
Engineers, Inc. (IEEE) produced papers on the problem from 2000 to present. 

Goddard Space Flight Center (GSFC) identified the problem in October 2008 during two different 
projects. Magnetospheric Multiscale (MMS) learned of the thermal runaway inside the advertised SOA 
during an MMS Spacecraft Power Review. JWST learned of this problem during bench testing of a power 
supply “protection circuit.” 

Thermal runaway is a problem affecting a wide range of modern MOSFETs from more than one
manufacturer. Older parts also show thermal runaway, but well outside the SOA. In most modern power
MOSFETs, thermal runaway now covers a larger area of the Vd-Id plane and lies inside the advertised
SOA. Refer to Figures 3 and 4.



Figure 3. Vcs~ Fay Planes. 



[Figure annotations: curves for Vgs = 8 V at 25°C and 125°C, Vgs = 6 V at 25°C and 125°C, and
Vgs = 3.0 V at 125°C and 25°C; region of thermal runaway inside the advertised SOA; time duration
needed for thermal runaway, from longer to shorter.]

Figure 4. Maximum safe voltage. 


From Figures 3 and 4 it can be seen that as Vds or the thermal resistance goes up, the SOA goes down.
If the Vds and thermal resistance of the system are low enough, no problem occurs. However, for any
given Vds, as the system moves away from the ideal, the thermal resistance increases and the thermal
instability begins to be a problem.

All of the above assumes that time is not involved and that the system is static. This is the starting
point for the first analysis; however, the thermal resistance is time dependent. The failure mechanism
reduces to a single cell reaching a temperature that causes the part to fail. If the pulse of power is
short enough, the temperature does not have time to reach a dangerous level. This is effectively why
the part can withstand being turned on and off repeatedly, which is its intended function. If, however,
the pulse of power is longer, the part has time to reach failure temperatures, even at lower power
levels.

Knowing the thermal resistance and where the voltages become critical, it is possible to begin a
review of the SOA for the parts in question. The standard SOA chart (see Figure 5) is composed of four
sets of boundaries. They are:

1. The boundary defined by the internal resistance of the part, which is physically impossible to
exceed.

2. The maximum allowable current the part can tolerate (bond wire fusing being a limiting factor).

3. The maximum reverse voltage, mainly limited by the voltage breakdown of the diffusion layer.

4. The maximum allowable power in the part, set by normal switching mode operation and heat
dissipation.
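
As a rough illustration of how the four boundaries above combine, the following Python sketch tests
whether an operating point (Vd, Id) lies inside a conventional SOA. The function name, the parameter
names, and every numerical limit are hypothetical placeholders rather than values from any data sheet,
and the sketch deliberately contains no thermal-instability limit.

```python
def soa_violations(v_d, i_d, r_on, i_max, v_max, p_max):
    """Return the conventional SOA boundaries violated at the point (v_d, i_d).

    All limits are illustrative assumptions, not data-sheet values:
      r_on  - on resistance in ohms (boundary 1)
      i_max - maximum drain current in amperes (boundary 2)
      v_max - maximum drain-to-source voltage in volts (boundary 3)
      p_max - maximum power in watts for the pulse length of interest (boundary 4)
    """
    violated = []
    if i_d > v_d / r_on:
        violated.append("resistance limit")
    if i_d > i_max:
        violated.append("maximum current limit")
    if v_d > v_max:
        violated.append("maximum voltage limit")
    if v_d * i_d > p_max:
        violated.append("maximum power limit")
    return violated


# Hypothetical limits for a made-up part.
limits = dict(r_on=0.05, i_max=20.0, v_max=100.0, p_max=100.0)
print(soa_violations(10.0, 5.0, **limits))   # 50 W: no boundary violated
print(soa_violations(10.0, 15.0, **limits))  # 150 W: power limit violated
```

An empty list means the point passes all four conventional checks; it says nothing about thermal
stability at that point.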


[Chart legend: 1. Resistance Limit; 2. Max. Current Limit; 3. Max. Voltage Limit; 4. Max. Power Limit
with Time]

Figure 5. Standard SOA chart.


In Figure 5, the area showing a thermal instability has not been included. Showing the region of
instability requires a fifth boundary, as shown in Figure 6, and changes the shape of the maximum power
limit.

[Chart legend: 1. Resistance Limit; 2. Max. Current Limit; 3. Max. Voltage Limit; 4. Max. Power Limit
with Time; 5. Thermal Instability]

Figure 6. Standard SOA chart including region of instability.


At low gate voltages, conduction in MOSFETs is charge-carrier dominated. Charge-carrier
concentration increases at higher temperatures, and the resulting positive feedback allows thermal
runaway. At high gate voltages, MOSFETs are mobility dominated. The charge-carrier mobility decreases
as temperature increases.


Charge carriers are less mobile at high temperatures. In older devices, the mobility effect “RULED”
over the SOA. Today, charge-carrier effects spill over into the SOA from its right side.

3.0 A Derivation of the Stability Criterion for Thermal Runaway, and the 
Spirito Effect 

3.1 Purpose 

The purpose of this section is to discuss the conditions under which thermal runaway can happen to a 
power MOSFET. 

Subsection 3.2 considers only the case that the die is perfectly uniform over each vertical cross 
section. 

In following sections, localized imperfections are shown to result in the formation of a ‘hot spot’ — a 
zone extending over only a few neighboring cells — that rapidly heats to destruction. 

3.2 Uniform Cross Sections 


Assume that the device, and especially the die inside, is uniform from edge to edge over each vertical
cross section in all its material and construction properties. There is particular interest in the
temperature of the top surface of the die, Tj, where the Field-Effect Transistor (FET) channels occupy
the upper few micrometers (see footnote 1).


1 Many semiconductor devices have a pn-junction, whose temperature affects the device’s behavior: the
‘junction temperature’ Tj. An n-type FET has an n-conduction channel connecting an n-source to an
n-sink; the cross-sectional area of this channel is changed by the electrical action of the gate.
Hence, only material of the same type is active, and there is no ‘junction’ involved in this
conduction.

3.2.1 Thermal resistance: part one


The temperature of the top surface of the die containing the ‘sea’ of FET-cells has an important effect
on the behavior of the power MOSFET. Applied power deposits heat into this surface, while the thermal
conductance of the die (and die attach, etc.) conducts heat from this surface: its temperature at the
time t is a function of the history of the power applied up to the moment t. One particular history is
to fix the case temperature Tc to a constant value (using sufficiently aggressive thermal clamping of
the case) and wait until the entire device is at this case temperature. Then, apply a constant power P
to the device, starting at t = 0 and continuing. The ratio of the rise in temperature of the top
surface, Tj, to the applied power P is defined as the thermal resistance between ‘junction’ and case
of the device:

$$R_{th}(t) = \frac{T_j(t) - T_c}{P} \tag{1}$$


This must begin at zero since Tj(t = 0) = Tc. It typically rises as √t until the thermal pulse reaches
the bottom of the case (this takes roughly h²/D, where h is the thickness of the die and D is the
thermal diffusivity of silicon), after which it saturates to a constant value that is typically within
a half-order of magnitude of 1°C/W.


Another particular history is to apply the power as a sequence of identical pulses with a fixed duty
factor D. A duty factor of zero, D = 0, returns the ‘single pulse’ result: Rth(t | D = 0) = Rth(t).
For a larger duty factor, the surface of the die is still somewhat heated by the previous pulses as
the ‘reference’ cycle starts; as a result, the value of Rth(t | D > 0) starts at a higher value than
zero, and Rth(t | D = 1) is constant at its saturated value.


Most data sheets offer a plot of the thermal resistance for a sequence of ‘constant power’ pulses of
various duty factors from D = 0 (the single pulse case) upward. See Figure 7 [1] for an example.



Figure 7. A typical plot of junction-to-case thermal resistance of a power MOSFET (IRF510). The
horizontal axis is ‘time t in seconds’ when a constant power is dissipated; the vertical axis is
‘temperature rise at the time t of the cells of the die per power deposited,’ in °C/W; and the various
curves are for various ‘duty cycles.’

Other power-histories result in other behaviors of the temperature of the top surface Tj(t): see
section (below).

3.2.2 Curve traces


• plot of Id vs Vd and Vg

• concentration on region for large Vd: saturation of Id vs Vd, so Id depends (almost) entirely on Vg

• re-plot of Id vs Vg at various values of T

• observation that dId/dT can be positive

• re-plot of Id vs T at fixed Vg, and observation of Taylor’s series


3.2.3 Derivation of the condition of thermal runaway 

The dependence of the drain current on the junction temperature is modeled with usable accuracy by a
linear function:

$$I_d(V_d, V_g, T_j) = I_d(V_d, V_g, T_c) + \alpha(V_d, V_g, T_c) \cdot (T_j - T_c) \tag{2}$$

$$\alpha(V_d, V_g, T_c) = \left. \frac{\partial I_d}{\partial T} \right|_{V_d, V_g} \tag{3}$$

The notation will be frequently simplified by suppressing the independent variables, using

$$I_d^{[c]} = I_d(V_d, V_g, T_c), \quad R_{th} = R_{th}(t), \quad \text{and} \quad \alpha = \alpha(V_d, V_g, T_c)$$


What happens as power is applied at constant drain and gate voltages can be anticipated when α > 0.
As the junction temperature rises, the current rises, increasing the power dissipated in the junction,
which increases its temperature even more and causes an additional rise in current. If this ‘positive
feedback’ is large enough, the junction temperature can run away to a disastrous value.

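Holding the drain and gate voltages constant, the rise in the junction temperature is computed as
follows:

$$
\begin{aligned}
T_j(t) - T_c &= R_{th}(t) \cdot P(t) && (4) \\
&= R_{th}(t) \cdot V_d \cdot I_d(V_d, V_g, T_j(t)) && (5) \\
&= R_{th}(t) \cdot V_d \cdot \left[ I_d^{[c]} + \alpha \cdot (T_j(t) - T_c) \right] && (6) \\
&= R_{th}(t) \cdot V_d \cdot I_d^{[c]} + R_{th}(t) \cdot V_d \cdot \alpha \cdot \left[ T_j(t) - T_c \right] && (7)
\end{aligned}
$$

$$\left[ T_j(t) - T_c \right] \cdot \left[ 1 - R_{th}(t) \cdot V_d \cdot \alpha \right] = R_{th}(t) \cdot V_d \cdot I_d^{[c]} \tag{8}$$

and this gives the ‘large enough’ result:

$$T_j(t) - T_c = \frac{R_{th} \cdot V_d \cdot I_d^{[c]}}{1 - R_{th} \cdot V_d \cdot \alpha} \tag{9}$$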


The numerator is the temperature rise that would happen if the power were held constant (see
footnote 2) at $P^{[c]} = V_d \cdot I_d^{[c]}$. The denominator determines the occurrence of thermal
runaway. The term Rth(t) · Vd · α is zero at the start, t = 0, and increases with time. If it remains
less than unity, the rise in the temperature of the junction, Tj(t) − Tc, remains bounded; however,
the junction temperature Tj(t) diverges to infinity if the term approaches unity. See Figure 8 [1].


Define the dimensionless stability factor S:

$$S = S(t, V_d, V_g, T_c) = R_{th}(t) \cdot V_d \cdot \alpha(V_d, V_g, T_c) = R_{th} \cdot V_d \cdot \alpha \tag{10}$$

Then Equation 9 states that the temperature of the surface of the die is bounded to a finite value
when S < 1, and diverges to infinite values when S > 1:

$$S < 1: \; T_j \text{ is bounded,} \tag{11}$$

$$S > 1: \; T_j \text{ diverges to infinity.} \tag{12}$$

This is the condition celebrated by P. Spirito, and shown by him [4] (and by a number of others as
well) to mark a curve on the SOA beyond which power MOSFETs show thermal runaway ending in
catastrophic failure (see footnote 3). These are called Spirito-mode failures.



Figure 8. Illustration of the rise in Tj as the drain voltage Vd is increased in steps from 5 V to
15 V, and the gate voltage Vg at each step is held constant at a value that makes the initial value of
drain current Id = 90 W / Vd. Note that Tj diverges when Vd is between 12 V and 13 V. Also shown is
the rise in Tj at a constant power of 90 W: Tj(t) = 18°C + Rth(t) · 90 W. Values are for ‘device 1’ in
Reference [4].


2 This is not strictly true, since the thermal resistance actually has a temperature dependence,
decreasing as the junction temperature increases; precise modeling must take this into account.

3 This criterion for thermal runaway is attributed by Marie Denison (et al.; [2]) to P. L. Hower and
P. K. Govil [3].




The surface of the die need not reach an infinite temperature in order to be destroyed. Silicon melts
at 1410°C. Aluminum melts at 660°C, and the eutectic temperature of aluminum-silicon alloys is 577°C.
The aluminum traces on the surface of the die interdiffuse with the silicon below 577°C, and the time
for this to become destructive is temperature dependent; there is enough interdiffusion at 450°C that
a few minutes at this temperature is used to obtain low contact resistance for Al-Si contacts. On the
other hand, a few hours at 400°C has no effect. FET manufacturers generally list 150°C or 175°C as the
absolute maximum temperature Tj,AbsMax for the complete device; none mention the median time to
degradation.


In summary, failure by ‘cooking’ the top surface can happen while S is less than unity: divergence of
Tj is not actually required. Having S < 1 does not guarantee the device is being operated safely;
rather, the stability factor must be sufficiently less than unity that Tj never approaches 450°C, and
users should require that Tj never exceeds the manufacturer’s value for Tj,AbsMax.

Figure 8 shows that Tj = 450°C is reached at slightly less than a second at Vd = 9 V for Spirito’s
device 1 (described in Reference [4]), and at about a tenth of a second at 12 V; however, divergence
to infinite temperatures does not happen until Vd exceeds 12 V. The drain voltage must be 5 V (or
less) to ensure Tj remains less than 175°C. Hence, ignoring the small increase in thermal resistance
after 0.1 second, devastating interdiffusion could happen when S ≈ (9 V/12 V) = 0.75, and caution in
this case requires S < (5 V/12 V) ≈ 0.4.
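
To make the stability factor concrete, the following Python sketch evaluates Equations 9 and 10 at
several drain voltages for a constant initial power of 90 W, the scenario of Figure 8. The thermal
resistance and α values are hypothetical, chosen only so that S crosses unity between 12 V and 13 V;
they are not Spirito’s measured data, and treating α as independent of Vd and Vg is a simplifying
assumption.

```python
import math


def stability_factor(r_th, v_d, alpha):
    """Dimensionless stability factor S = Rth * Vd * alpha (Equation 10)."""
    return r_th * v_d * alpha


def junction_rise(r_th, v_d, i_d_c, alpha):
    """Temperature rise Tj - Tc from Equation 9; math.inf once S >= 1."""
    s = stability_factor(r_th, v_d, alpha)
    if s >= 1.0:
        return math.inf
    return r_th * v_d * i_d_c / (1.0 - s)


# Hypothetical device values (assumptions, not measured data).
R_TH = 0.8     # saturated thermal resistance, degC/W
ALPHA = 0.1    # dId/dT at the operating point, A/degC
P_INIT = 90.0  # initial power in watts

for v_d in (5.0, 9.0, 12.0, 13.0):
    i_d_c = P_INIT / v_d  # gate voltage chosen so the initial power is 90 W
    s = stability_factor(R_TH, v_d, ALPHA)
    rise = junction_rise(R_TH, v_d, i_d_c, ALPHA)
    print(f"Vd = {v_d:4.1f} V  S = {s:.2f}  rise = {rise:.0f} degC")
```

With these made-up numbers the computed rise is already far above any safe junction temperature at
voltages where S is still below unity, which is the point made in the text: S < 1 alone is not a
safety criterion.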


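Clamping the voltages Vd and Vg, and the case temperature Tc, does not clamp the drain current Id or
the power P = Vd · Id to constant values. The drain current and the power change with the changing
temperature of the surface of the die, Tj(t), resulting in (with $P^{[c]} = V_d \cdot I_d^{[c]}$):

$$T_j(t) = T_c + \frac{R_{th}(t) \cdot P^{[c]}}{1 - R_{th}(t) \cdot V_d \cdot \alpha} = T_c + \frac{R_{th}(t) \cdot P^{[c]}}{1 - S} \tag{13}$$

$$I_d(t) = I_d^{[c]} \cdot \frac{1}{1 - S} \tag{14}$$

and

$$P(t) = V_d \cdot I_d(t) = P^{[c]} \cdot \left[ 1 + \frac{R_{th}(t) \cdot V_d \cdot \alpha}{1 - R_{th}(t) \cdot V_d \cdot \alpha} \right] = P^{[c]} \cdot \frac{1}{1 - S} \tag{15}$$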


The usual method of reporting a SOA by using the drain current as the vertical axis is misleading. The
device does not continue to operate at the same Vd, Id point when thermal runaway happens; rather,
this point becomes a vertical trajectory. It may start at a region that is ‘safe’ (in that Tj would
always remain below Tj,AbsMax if Id retained that original value) but then move into a region that
‘cooks’ the die. Thus, the set of starting Vd-, Id-points that always stay ‘safe’ must be identified
as they evolve with time, and plotted on the SOA diagram.


3.2.4 Hot Spots

The criterion for thermal runaway, S > 1, is derived above on the assumption that the entire surface
of the die is heating uniformly. Indeed, a uniform die is the condition for defining the thermal
resistance Rth(t) that appears in the criterion.

Actual devices do not have cross sections that are rigorously uniform in properties and behaviors.
There are often voids in the die attach. When a void is larger than roughly the thickness of the die,
it substantially increases the thermal resistance of the path starting at the die’s surface and
passing vertically through that void to get to the case. There are also local variations in doping
and fabrication-geometry that result in some cells having a higher threshold voltage; such cells
conduct less current than others with lower threshold voltage. The current-supply leads bonded to the
top of the die provide an additional thermal path to case, and thus lower the thermal resistance of
the zone at and around each lead attachment location. A large enough pulse of drain current flooding
into the aluminum traces on the surface of the die can induce a local voltage drop across these traces
so that Vd is larger in the zone where the current-supply leads are attached. Thus, S does not have a
constant value over the cross section of an actual die. Even when S is safely less than unity across
most of the die’s cross section, it may still be unsafe at some particular zone. This can be a source
of part-to-part variations in resistance to Spirito-mode failures.

As the entire surface begins heating into a thermal runaway process per the Spirito-mode, some zones
will heat faster than the typical zone; one of these will develop into a hot spot. It will ‘hog’
current from its neighbors (all at the same applied Vd and Vg), allowing it to accelerate beyond them
toward dangerously high temperatures. Inspection of the destroyed device shows a small melted zone.

Much of the literature discussing Spirito-mode failures calls attention to the development of these hot 
spots and may give the impression that they are the cause of this class of failures instead of a natural 
consequence of the final moments of the process. However, obtaining devices that are completely free of 
non-uniformities would not eliminate the Spirito-mode failure, which is present (per Equation 9) even in a 
completely uniform device. 

4.0 Effect of β on the Spirito Criterion for Thermal Runaway of a Power
MOSFET

4.1 Apply Power: The MOSFET Heats 

One way to apply power to a power MOSFET is to hold constant both the drain voltage Vd and the
drain current Id. (This requires changing the gate voltage Vg as the die heats.) The power deposited
into the ‘sea of FETs’ on the surface of the die is then a constant, P = Vd · Id, and the temperature
of the ‘sea’ is given by the thermal resistance:

$$T(t) = T(0) + R_{th}(t) \, P = T(0) + R_{th}(t) \, V_d \, I_d \tag{16}$$

This power can be applied until the ‘sea’ heats to a dangerous extent; heating longer will damage the
device. There is a relationship between what maximum temperature is ‘safe’ (i.e., the device will be
unchanged when it returns to a normal temperature) or ‘unsafe,’ and the duration of the time the ‘sea’
is held at that high temperature. If the duration is less than a microsecond, then perhaps reaching
350°C is safe. If the duration is longer than seconds, then perhaps 175°C is safe; but some
manufacturers report Tsafe = 150°C. Using the latter value and T(0) = 25°C, the time (tsafe) during
which the power can be applied, while limiting the temperature rise to the ‘safe’ value, is found by
solving:

$$R_{th}(t_{safe}) = \frac{125^{\circ}\mathrm{C}}{I_d \, V_d} \tag{17}$$

An estimate of the thermal resistance Rth(t) is derived from the data sheet for the part, and the
inter-relationship between tsafe, Id, and Vd can be used to draw the ‘safe power’ line on the SOA
diagram.
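
As a minimal sketch of how Equation 17 can be solved for tsafe, the Python below assumes the
single-pulse thermal resistance follows the shape described in Section 3.2.1: a √t rise that saturates
at a constant value. The coefficient `k`, the saturated value `r_sat`, and the operating points are
hypothetical; a real analysis would read Rth(t) from the part’s data sheet or from measurement.

```python
import math


def r_th_model(t, k, r_sat):
    """Assumed transient thermal resistance: k * sqrt(t), capped at r_sat (degC/W)."""
    return min(k * math.sqrt(t), r_sat)


def t_safe_constant_power(v_d, i_d, delta_t_safe, k, r_sat):
    """Time at which Rth(t) reaches delta_t_safe / (Id * Vd), per Equation 17.

    Returns math.inf when the saturated thermal resistance stays below the
    required value, i.e. the rise never reaches delta_t_safe.
    """
    r_required = delta_t_safe / (i_d * v_d)
    if r_required > r_sat:
        return math.inf
    return (r_required / k) ** 2  # invert Rth(t) = k * sqrt(t)


# Hypothetical thermal parameters (assumptions, not data-sheet values).
K = 5.0        # degC/W per sqrt(second)
R_SAT = 1.0    # saturated junction-to-case thermal resistance, degC/W
DELTA_T = 125.0  # Tsafe - T(0) = 150 degC - 25 degC

print(t_safe_constant_power(10.0, 50.0, DELTA_T, K, R_SAT))  # 500 W pulse
print(t_safe_constant_power(10.0, 10.0, DELTA_T, K, R_SAT))  # 100 W: never reaches Tsafe
```

Sweeping Vd and Id for a fixed pulse length and keeping the points whose tsafe exceeds that length
traces out one ‘safe power’ line of the kind described above.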
A different way to apply power to a MOSFET is to maintain both the drain voltage Vd and the gate
voltage Vg constant. As the ‘sea’ heats, the drain current changes (determined by curve tracing at
various temperatures). If the drain current decreases with increasing temperature, the device is
stable. The contrary means the device can suffer ‘thermal runaway.’

If the dependence of the drain current on temperature is described using a Taylor’s series:

$$I_d(V_d, V_g, T) = I_d^{[0]} + \alpha \cdot (T - T_0) + \beta \cdot (T - T_0)^2 + O(3), \tag{18}$$

where

T0 = T(0) is the temperature, such as 25°C, of the ‘sea’ at t = 0,

$I_d^{[0]} = I_d(V_d, V_g, T_0)$,

$\alpha = \partial I_d / \partial T$ evaluated at Vd, Vg, T0,

$\beta = \frac{1}{2} \, \partial^2 I_d / \partial T^2$ evaluated at Vd, Vg, T0.

If β · (T − T0)² is small enough compared with α · (T − T0), the β-term can be ignored.
Then algebra gives:

$$T(t) - T_0 = \frac{R_{th}(t) \cdot V_d \cdot I_d^{[0]}}{1 - S} \tag{19}$$

where S = α · Vd · Rth(t) is the Spirito stability factor, and where the effect of the time-changing
power on the thermal resistance is thought to be ignorable. As S → 1, the temperature diverges to
infinite values; this is an unstable case. For this to happen, α must be positive (the drain current
Id must increase with increasing temperature) and α must be sufficiently positive that the triple
product S approaches unity.


The power MOSFET is unsafe long before S → 1. All that is necessary is that T(t) reach Tsafe. This
happens when

$$R_{th}(t_{safe}) = (1 - S) \cdot \frac{T_{safe} - T_0}{V_d \, I_d^{[0]}}$$

with S evaluated at t = tsafe. The ‘safe’ time is reduced when α > 0 and increased when α is negative.


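If β ≠ 0, then solving Equation 18 together with T(t) − T0 = Rth · Vd · Id as a quadratic in
T(t) − T0 gives

$$T(t) - T_0 = \frac{1 - S}{2 \beta R_{th} V_d} \left[ 1 - \sqrt{1 - \frac{4 \beta R_{th}^2 V_d^2 I_d^{[0]}}{(1 - S)^2}} \right] \tag{20}$$

$$= \frac{R_{th} V_d I_d^{[0]}}{1 - S} \cdot \frac{2}{1 + \sqrt{1 - 4 \beta R_{th}^2 V_d^2 I_d^{[0]} / (1 - S)^2}} \tag{21}$$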
This expression reduces to Equation 19 when β → 0, and it has the singularity at precisely the same
condition (S → 1). The numerical details are affected by whether β is positive or negative.

This expression can be solved for the time tsafe this ‘sea’ takes to reach Tsafe.

Note: the modeling of T(t) and of tsafe has been improved by including β, but the modeling has been
degraded in a different way: once the drain current Id depends on time, the power dissipated into the
‘sea’ is not constant with time; rather, this power may increase (α > 0) or it may decrease (α < 0)
during the pulse. This means the temperature is no longer given by Equation 17. Recomputing must be
done to find the ‘correct’ heating of the ‘sea.’
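
The influence of β can be explored numerically. The Python sketch below solves the self-consistency
condition T − T0 = Rth · Vd · Id(T), with Id(T) taken from the second-order Taylor series of
Equation 18, as a quadratic in the temperature rise. It is an illustration under stated assumptions,
not the report’s own procedure: it treats Rth as a fixed number, keeps the root that reduces to
Equation 19 when β = 0, and uses hypothetical device values.

```python
import math


def rise_with_beta(r_th, v_d, i_d0, alpha, beta):
    """Temperature rise T - T0 including the beta term of Equation 18.

    Solves  A*beta*x**2 - (1 - S)*x + A*i_d0 = 0  with A = Rth*Vd, S = alpha*A,
    and returns math.inf when no bounded, positive solution exists.
    """
    a = r_th * v_d
    s = alpha * a
    if beta >= 0.0 and s >= 1.0:
        return math.inf  # linear term alone already diverges
    disc = (1.0 - s) ** 2 - 4.0 * a * a * beta * i_d0
    if disc < 0.0:
        return math.inf  # curvature removes the fixed point: runaway
    # Conjugate form of the quadratic root; well behaved as beta -> 0.
    return 2.0 * a * i_d0 / ((1.0 - s) + math.sqrt(disc))


# Hypothetical values: Rth = 0.8 degC/W, Vd = 5 V, alpha = 0.1 A/degC.
print(rise_with_beta(0.8, 5.0, 2.0, 0.1, 0.0))     # beta = 0: Equation 19
print(rise_with_beta(0.8, 5.0, 2.0, 0.1, 0.001))   # small positive beta: larger rise
print(rise_with_beta(0.8, 5.0, 18.0, 0.1, 0.001))  # higher current: no bounded solution
```

In this toy case a β one hundred times smaller than α is enough to remove the bounded solution at the
higher current, even though S is only 0.4.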


5.0 MOSFET Testing Results and Information Dissemination 

5.1 Nondestructive Test for MOSFET Thermal Instability 

5.1.1 Assumptions Made 

With an understanding of the failure mechanism, the next step is to determine what is needed to record
actual test data. The theory makes several assumptions about the beginning of the failure that no
longer hold at the end of the part’s failure. One underlying assumption of the theoretical sections,
that the MOSFET heats evenly throughout the event, may seem incorrect. However, test data taken with
the lid removed show that the temperature does remain uniform across the die to within ten percent; it
is only when the part is starting to fail that the temperature deviates across the MOSFET. The second
assumption is that the intrinsic diodes internal to the part show the temperature of the hottest point
in the MOSFET when a small reverse current is applied to them. Again, this works well as long as the
temperature remains uniform. Once a hot spot starts to appear, the temperature reported by the diodes
starts to deviate from externally made measurements. Unfortunately, methods used to read the
temperature independently of the part require the part to be destroyed (delidded). With the exception
of finding the thermal resistance, all testing is to be done while trying to avoid raising the
temperature of the MOSFET above its starting point. Lastly, it is assumed that all parts are to be
tested non-destructively.

5.1.2 Data Needed From Test 

The intent of the testing is to collect four sets of data. This data can be used to analyze the
location within the SOA where a MOSFET will have thermal instability problems. The four sets are:

1. Temperature of the MOSFET (T).

2. The voltages on the MOSFET (Vd).

3. Current going through a MOSFET at known Gate voltages at different known temperatures (Id).

4. Thermal Resistance; the temperature increase for a given power for a given period of time (Rth).

Test Equipment

While it would be very easy to state the steps used to determine these values, a more generic
description is being attempted to avoid the need for having any specific piece of equipment.

Note that to determine α and even β, voltage and current are needed at the same time. An I-V curve
tracer serves the purpose well if it is modified to drive the test parts to different temperatures.

MOSFET Temperature 

All measurements need to originate from the MOSFET’s internal temperature. The temperature required
is located inside the sealed part, hidden from direct observation, with one exception: the die itself
can report it. The intrinsic diode (part of all MOSFETs) can be used to determine the temperature of
the MOSFET. A schematic for intrinsic diode temperature measurement is shown in Figure 9. By placing a
small negative current (1 mA to 10 mA) between the Drain and Source, a voltage can be read that
correlates with the temperature. The voltage change should be near 0.002 volts per °C for a 10 mA
source. The actual voltage change can be determined by running the test-current source through the
intrinsic diodes at several known temperatures, and building up a table of the temperatures and
voltages, or by determining the slope and offset of the function. The temperature should be well
controlled, and there should be no power in the MOSFET other than the 10 mA test current. If air is
being used to control the temperature of the part, the part should be monitored to determine when it
is in equilibrium with the moving air. Equalizing the part’s temperature and the air stream may take
up to 20 minutes per temperature step and should be verified by the internal current (voltage)
measurement. At a minimum, verified temperature measurements should be made at 0°C, 25°C, and 100°C.
Other temperatures can be extracted from the voltage measurements.
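
The slope-and-offset calibration described above can be sketched as a least-squares line fit. In the
Python below, the three calibration voltages are hypothetical magnitudes chosen to fall by 2 mV per
°C; real values must come from the part under test at the chosen test current, and the function names
are illustrative only.

```python
def fit_line(temps_c, volts):
    """Least-squares slope (V/degC) and offset (V) of diode voltage vs temperature."""
    n = len(temps_c)
    mean_t = sum(temps_c) / n
    mean_v = sum(volts) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(temps_c, volts))
    slope = sxy / sxx
    offset = mean_v - slope * mean_t
    return slope, offset


def temperature_from_voltage(v, slope, offset):
    """Invert the calibration line to estimate the die temperature in degC."""
    return (v - offset) / slope


# Hypothetical calibration points at the minimum set of temperatures.
cal_temps = [0.0, 25.0, 100.0]      # degC, verified by soak
cal_volts = [0.650, 0.600, 0.450]   # V, magnitude of the intrinsic-diode voltage

slope, offset = fit_line(cal_temps, cal_volts)
print(slope, offset)
print(temperature_from_voltage(0.500, slope, offset))
```

More calibration points can be passed in without changing the code; the fit then averages out
measurement noise instead of passing exactly through each point.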



Figure 9. Simplified schematic for intrinsic diode temperature measurement. 


Voltage in the Device at Set Temperatures 

With a way of knowing the internal MOSFET temperature, the next requirement is to determine the Drain
to Source voltage. The same meter can be used for both temperature and voltage if a second switch is
employed to stop the main current following the test time. This is needed due to the reversal in
current. The temperatures will read as negative voltages and the Vds voltage will appear positive. The
meter used is required to take measurements very quickly and hold the reading. The intention of the
short measurement times is to reduce the amount of heat built up in the MOSFET. For voltage and
current measurements at temperature, heating of the MOSFET needs to be kept to a minimum. Increases in
temperature should be less than twice the noise floor of the voltmeter (5°C over a series of test
pulses, 10 pulses in 10 seconds). To keep the temperature unchanged, the test pulses should be limited
to approximately 20 µs. Voltage probe points should be as close to the test MOSFET as possible, and
the probes should not be used to carry current. These need to be Kelvin connections.

Along with using Kelvin connections, it is necessary to have a very stiff voltage supply. Because the
resistances of the total circuit and the on resistance internal to the part are very low, large
capacitors close to the switches are recommended. A large capacitor will help lower the need for heavy
wire running to a power supply.

Depending on the method of taking the voltage, there may be a problem with the measurements. Most
oscilloscopes are limited in voltage measurements to values that are on the screen. If the trace goes
off screen the input saturates, and the scope requires time to recover once the signal is back on the
screen. The oscilloscope’s recovery time can be verified with the manufacturer. Signals that fall
within this time interval will be in error, and cannot be used or trusted. Using a single channel set
to a large voltage level to capture both the temperature reading and the Vds value will increase the
noise in the temperature readings. The use of two channels may provide a better solution. If a smaller
scale is used for the temperature reading, a small-signal diode and large resistor can be used to
block the higher forward voltage.

Current at the Set Temperature 

The current required is the current flowing through the MOSFET (Id). A schematic for Id current
measurement can be seen in Figure 10. The use of a large capacitor in the system not only helps with
holding the voltage stable, but also allows the current to flow by reducing the total path impedance
during the on pulse. The current reading needs to be taken inside the capacitor-MOSFET loop. A current
shunt could be used if it is of sufficient size to allow for high currents (20 A to 30 A). However, a
current shunt will add resistance to the overall system, and could easily double the resistance of the
circuit. For this reason a current probe is recommended.



Figure 10. Simplified schematic for Id current measurement.


Thermal Resistance 

Thermal resistance is a time-dependent measurement of the temperature of the MOSFET die with a
given power input. For short periods of time the change in temperature remains in the die itself. With
slightly longer periods of time the temperature rise will travel out of the die into the base of the
part. This process is repeated for the heat traveling into the printed circuit (PC) board, and then
again into the PC board’s heat sink. The manufacturer has no way of knowing the thermal path outside
of their part’s case and cannot advertise the total thermal resistance for time periods greater than
1 second. It is the total thermal resistance that is required if the part is being used for long time
periods. It should also be noted that some MOSFET thermal resistance charts have been found to be
incorrect for the stated parts.

5.2 Test Data Development of Data

With the four sets of data discussed in Section 5.1.2, it is possible to record values for α and β.
With this data, the Spirito stability criterion can be determined.

It should be noted that α and β are curves as a function of Vgs or even Id. For derating of the
stability criterion, the maximum values of α and β should be used. Testing to date has shown β to be
about 100 times smaller in value than α.

5.2.1 Alpha 

If a curve tracer has been used, α can be found by selecting a voltage common to two temperature data
sets and gate voltages, subtracting the lower temperature data point from the higher temperature data
point, and dividing by the temperature difference of the two data points. The α curve will be the
collection of these current differences per degree versus the gate voltages. The temperature steps
used should be limited to no greater than 25°C. Smaller temperature steps would be better.

5.2.2 Beta

If α is found over a series of temperature steps, β follows from the change in α per unit increase in
temperature between successive steps (one half of that change, per the definition of β in the Taylor
series of Equation 18).
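
A minimal Python sketch of this finite-difference procedure, for one gate voltage and one drain
voltage, is given below. The three current readings are hypothetical, generated from a made-up
quadratic so the result can be checked by hand; the function names are illustrative and not part of
any instrument’s software.

```python
def alpha_between(i_lo, i_hi, t_lo, t_hi):
    """Two-point estimate of alpha = dId/dT (A/degC) between two temperatures."""
    return (i_hi - i_lo) / (t_hi - t_lo)


def alpha_beta_from_three_points(temps_c, currents):
    """Estimate alpha on each temperature step and beta from their difference.

    Each two-point alpha is attributed to the midpoint of its step, so the two
    alphas are separated by (t2 - t0) / 2. Beta is half of d(alpha)/dT, matching
    the Taylor-series definition beta = (1/2) * d2Id/dT2.
    """
    t0, t1, t2 = temps_c
    i0, i1, i2 = currents
    alpha_lo = alpha_between(i0, i1, t0, t1)
    alpha_hi = alpha_between(i1, i2, t1, t2)
    midpoint_gap = (t2 - t0) / 2.0
    beta = 0.5 * (alpha_hi - alpha_lo) / midpoint_gap
    return alpha_lo, alpha_hi, beta


# Hypothetical readings at a fixed Vd and Vg, with 25 degC steps.
temps = (0.0, 25.0, 50.0)          # degC
currents = (0.125, 2.0, 5.125)     # A

alpha_lo, alpha_hi, beta = alpha_beta_from_three_points(temps, currents)
print(alpha_lo, alpha_hi, beta)
```

Repeating the calculation at each gate voltage and taking the largest α and β found gives the
conservative values recommended for derating.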

6.0 Executive Summary 

Based on recent testing and failure investigations, it appears the “old” SOA application curves are 
inaccurate with regard to the SOA of some MOSFET parts. These parts are used extensively in flight 
hardware and ground support equipment. 

With the push for faster-switching, lower on-resistance power MOSFETs came an unintended consequence
similar to the secondary voltage breakdown effect, which had not been seen since the prime of the
bipolar transistor. While MOSFETs are in the charge-carrier dominated region (low Vgs), the MOSFET
allows more current to flow as the temperature increases, causing a thermal runaway. It was discovered
that the SOA curves given by the manufacturers do not show the region of thermal instability. A review
of papers from the automotive industry is described, and recommendations to add the area of thermal
instability are included. The four factors that are important in determining the thermal instability
are:

1. α, defined as the change in current over the change in temperature (dId/dT).

2. β, defined as the acceleration of current over the change in temperature (d²Id/dT²).

3. Thermal resistance of the surface of the MOSFET, which is the change in temperature over the
power in the part.

4. The voltage across the MOSFET from the Drain to the Source.